Principal Components Regression With Data Chosen Components and Related Methods

Authors

  • J. T. Gene Hwang
  • Dan Nettleton
Abstract

Multiple regression with correlated predictor variables is relevant to a broad range of problems in the physical, chemical, and engineering sciences. Chemometricians, in particular, have made heavy use of principal components regression and related procedures for predicting a response variable from a large number of highly correlated predictors. In this paper we develop a general theory that guides us in choosing principal components that yield very good estimates of regression coefficients. Our numerical results suggest that the theory also can be used to improve partial least squares regression estimators and regression estimators based on rotated principal components. Our methods also provide insight about the subspace of the predictor matrix that explains the response best.
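As a rough illustration of the setting the abstract describes, the following sketch fits a principal components regression in which the retained components are chosen from the data rather than simply taken in order of decreasing variance. It is not the authors' selection theory, which the abstract does not spell out. The function names `pcr_fit` and `rank_components_by_fit`, the ranking rule (sum of squares of the response explained by each component), and the simulated data are all assumptions made for this example.

```python
import numpy as np


def pcr_fit(X, y, keep):
    """Principal components regression using only the components in `keep`.

    Returns (intercept, beta) on the scale of the original predictors.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    # SVD of the centered predictors: Xc = U diag(s) Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Least squares coefficient of yc on each component score column U[:, j] * s[j]
    gamma = (U.T @ yc) / s
    # Zero out the coefficients of components that are not retained
    mask = np.zeros(len(s), dtype=bool)
    mask[list(keep)] = True
    beta = Vt.T @ (gamma * mask)
    intercept = y_mean - x_mean @ beta
    return intercept, beta


def rank_components_by_fit(X, y):
    """Order components by how much of the response each one explains.

    Hypothetical rule for illustration only, not the rule from the paper.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    explained = (U.T @ yc) ** 2  # regression sum of squares per component
    return np.argsort(explained)[::-1]


# Simulated example with two pairs of highly correlated predictors (assumed data)
rng = np.random.default_rng(0)
Z = rng.normal(size=(60, 2))
X = np.column_stack([
    Z[:, 0], Z[:, 0] + 0.05 * rng.normal(size=60),
    Z[:, 1], Z[:, 1] + 0.05 * rng.normal(size=60),
])
y = 1.0 + X @ np.array([1.0, 1.0, -0.5, -0.5]) + 0.1 * rng.normal(size=60)

order = rank_components_by_fit(X, y)
intercept, beta = pcr_fit(X, y, keep=order[:2])  # keep the two best-fitting components
```

Retaining every component reproduces ordinary least squares, so the choice of `keep` is the only place where this sketch differs from a full regression on the original predictors.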


Similar resources

Choosing the Best Hierarchical Clustering Technique Based on Principal Components Analysis for Suspended Sediment Load Estimation

1- INTRODUCTION The assessment of watershed sediment load is necessary for controlling soil erosion and reducing the potential for sediment production. Differing estimates of sediment amounts, along with the lack of long-term measurements, limit access to reliable data series of erosion rate and sediment yield. Therefore, the observed data of suspended sediment load could be used to ...


Derivation of regression models for pan evaporation estimation

Evaporation is an essential component of the hydrological cycle. Several meteorological factors play a role in the amount of pan evaporation. These factors are often related to each other. In this study, multiple linear regression (MLR) in conjunction with Principal Component Analysis (PCA) was used for modeling pan evaporation. After the standardization of the variables, independent components were...


Using Latent Variables in a Logistic Regression Model to Remove the Effect of Multicollinearity in the Analysis of Some Factors Associated With Breast Cancer

Background and Objectives: Logistic regression is one of the most widely used generalized linear models for analyzing the relationships between one or more explanatory variables and a categorical response. Strong correlations among explanatory variables (multicollinearity) considerably reduce the efficiency of the model. In this study we used latent variables to reduce the effects of ...


Analysis of physiochemical and microbial quality of waters of the Karkheh River in southwestern Iran using multivariate statistical methods

Rapid population growth as well as agricultural and industrial development have increased the contamination of Iranian rivers. This study utilized principal components analysis (PCA) to determine the degree of significance of qualitative parameters of water resources in the Karkheh River in southwestern Iran. Cluster analysis (CA) grouped the monitoring stations based on the water quality data ...


STA 4107/5107 Statistical Learning: Principle Components and Partial Least Squares Regression

Principal components analysis is traditionally presented as an interpretive multivariate technique, where the loadings are chosen to maximally explain the variance in the variable. However, we will consider it here mainly as a statistical learning tool, by using the derived components in a least squares regression to predict unobserved response variables using the principal components. Principa...



Journal title:
  • Technometrics

Volume 45  Issue 

Pages  -

Publication date 2003